60 research outputs found
Recommended from our members
Single subject transcriptome analysis to identify functionally signed gene set or pathway activity
Analysis of single-subject transcriptome response data is an unmet need of precision medicine, made challenging by the high dimension and dynamic nature of the data and by the difficulty of extracting meaningful signals from biological or stochastic noise. We have proposed a method for single-subject analysis that uses a mixture model for transcript fold-change clustering from isogenically paired samples, followed by integration of these distributions with Gene Ontology Biological Processes (GO-BP) to reduce dimension and identify functional attributes. We then extended these methods to develop functional signing metrics for gene set process regulation by incorporating biological repressor relationships encoded in GO-BP as "negatively regulates" edges. Results revealed reproducible and biologically meaningful signals from analysis of a single subject's response, opening the door to future transcriptomic studies where subject and resource availability are currently limiting. We used inbred mouse strains fed different diets to provide isogenic biological replicates, permitting rigorous validation of our method. We compared significant genotype-specific GO-BP term results for overlap and rank order across three replicate pairs per genotype, and across methods against reference standards (limma+FET, SAM+FET, and GSEA). All single-subject analytics findings were robust and highly reproducible (median area under the ROC curve=0.96, n=24 genotypes x 3 replicates), providing confidence and validation of this approach for analyses in single subjects. R code is available online at http://www.lussiergroup.org/publications/PathwayActivity
University of Arizona Health Sciences CB2, the BIO5 Institute; NIH [U01AI122275, HL132532, CA023074, 1UG3OD023171, 1R01AG053589-01A1, 1S10RR029030]
Open access journal
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at [email protected]
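The fold-change clustering step described in this abstract can be illustrated with a small sketch. This is not the authors' R implementation: it assumes a two-component Gaussian mixture fitted by EM to absolute log fold changes, and the simulated data, function names and parameter values are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Component 0 is initialised low (unchanged transcripts) and component 1
    high (responsive transcripts). Returns (weights, means, sds, resp), where
    resp[i] is the posterior probability that values[i] is in component 1.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    means = [lo + 0.25 * span, lo + 0.75 * span]
    sds = [span / 4.0, span / 4.0]
    weights = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior probability of the "responsive" component.
        for i, x in enumerate(values):
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp[i] = p1 / total if total > 0.0 else 0.5
        # M-step: re-estimate weights, means and standard deviations.
        n1 = sum(resp)
        n0 = len(values) - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break
        means = [
            sum((1.0 - r) * x for r, x in zip(resp, values)) / n0,
            sum(r * x for r, x in zip(resp, values)) / n1,
        ]
        var0 = sum((1.0 - r) * (x - means[0]) ** 2 for r, x in zip(resp, values)) / n0
        var1 = sum(r * (x - means[1]) ** 2 for r, x in zip(resp, values)) / n1
        sds = [max(math.sqrt(var0), 1e-3), max(math.sqrt(var1), 1e-3)]
        weights = [n0 / len(values), n1 / len(values)]
    return weights, means, sds, resp


# Simulated absolute log2 fold changes for one paired sample (hypothetical):
# 900 unchanged transcripts and 100 responsive ones.
random.seed(0)
abs_log_fc = [abs(random.gauss(0.0, 0.15)) for _ in range(900)]
abs_log_fc += [abs(random.gauss(2.0, 0.4)) for _ in range(100)]

weights, means, sds, resp = fit_two_component_mixture(abs_log_fc)
n_responsive = sum(1 for r in resp if r > 0.5)
print("component means:", means)
print("transcripts assigned to the responsive component:", n_responsive)
```

In a pipeline like the one the abstract outlines, the per-transcript posterior probabilities would then be aggregated over the genes annotated to each GO-BP term; that step is not shown here.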
Developing a 'personalome' for precision medicine: emerging methods that compute interpretable effect sizes from single-subject transcriptomes
The development of computational methods capable of analyzing -omics data at the individual level is critical for the success of precision medicine. Although unprecedented opportunities now exist to gather data on an individual's -omics profile ('personalome'), interpreting and extracting meaningful information from single-subject -omics remain underdeveloped, particularly for quantitative non-sequence measurements, including complete transcriptome or proteome expression and metabolite abundance. Conventional bioinformatics approaches have largely been designed for making population-level inferences about 'average' disease processes; thus, they may not adequately capture and describe individual variability. Novel approaches intended to exploit a variety of -omics data are required for identifying individualized signals for meaningful interpretation. In this review, intended for biomedical researchers, computational biologists and bioinformaticians, we survey emerging computational and translational informatics methods capable of constructing a single subject's 'personalome' for predicting clinical outcomes or therapeutic responses, with an emphasis on methods that provide interpretable readouts. Key points: (i) single-subject analytics of the transcriptome show the greatest development to date, and (ii) the methods were all validated in simulations, cross-validations or independent retrospective data sets. This survey uncovers a growing field that offers numerous opportunities for the development of novel validation methods and opens the door for future studies focusing on the interpretation of comprehensive 'personalomes' through the integration of multiple -omics, providing valuable insights into individual patient outcomes and treatments.
National Institute of Health (NIH)/Office of the Director Precision Medicine Initiative [1UG3OD023171-01]; Precision Medicine Initiative of the Center for Biomedical Informatics and Biostatistics of the University of Arizona Health Sciences; NIH/National Heart, Lung, and Blood Institute [HL126609-01, HL132523, U01 HL125208]; NIH/National Cancer Institute [P30CA023074, 1R01CA190696-01]; NIH/National Institute of Allergy and Infectious Diseases [U01AI122275-01]
Open access article
Precision drug repurposing via convergent eQTL-based molecules and pathway targeting independent disease-associated polymorphisms
Repurposing existing drugs for new therapeutic indications can improve success rates and streamline development. Use of large-scale biomedical data repositories, including eQTL regulatory relationships and genome-wide disease risk associations, offers opportunities to propose novel indications for drugs targeting common or convergent molecular candidates associated with two or more diseases. This proposed novel computational approach scales across 262 complex diseases, building a multi-partite hierarchical network integrating (i) GWAS-derived SNP-to-disease associations, (ii) eQTL-derived SNP-to-eGene associations incorporating both cis- and trans-relationships from 19 tissues, (iii) protein target-to-drug, and (iv) drug-to-disease indications with (v) Gene Ontology-based information theoretic semantic (ITS) similarity calculated between protein target functions. Our hypothesis is that if two diseases are associated with a common or functionally similar eGene, and a drug targeting that eGene/protein in one disease exists, the second disease becomes a potential repurposing indication. To explore this, all possible pairs of independently segregating GWAS-derived SNPs were generated, and a statistical network of similarity within each SNP-SNP pair was calculated according to scale-free overrepresentation of convergent biological processes activity in regulated eGenes (ITS_eGENE-eGENE) and scale-free overrepresentation of common eGene targets between the two SNPs (ITS_SNP-SNP). Significance of ITS_SNP-SNP was conservatively estimated using empirical scale-free permutation resampling keeping the node-degree constant for each molecule in each permutation. We identified 26 new drug repurposing indication candidates spanning 89 GWAS diseases, including a potential repurposing of the calcium-channel blocker Verapamil from coronary disease to gout. Predictions from our approach are compared to known drug indications using DrugBank as a gold standard (odds ratio=13.1, p-value=2.49 x 10^-8). Because of specific disease-SNP associations to candidate drug targets, the proposed method provides evidence for future precision drug repositioning to a patient's specific polymorphisms.
University of Arizona Health Sciences CB2; BIO5 Institute; UA Cancer Center; NIH [U01AI122275]
Open access journal
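For readers unfamiliar with how an odds ratio and p-value against a gold standard are obtained, the following is a minimal sketch using a 2x2 contingency table and a one-sided Fisher's exact test. The abstract does not state which test produced the reported values, so the choice of test is an assumption, and the counts below are invented for illustration only; they are not the study's data.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: P(overlap >= a) under the
    hypergeometric null with all table margins held fixed."""
    row1 = a + b          # e.g. drug-disease pairs predicted by the method
    col1 = a + c          # e.g. pairs listed as known indications
    n = a + b + c + d     # all drug-disease pairs considered
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Hypothetical counts (NOT from the paper):
#                  known indication   not known
# predicted               12              88
# not predicted           10             890
a, b, c, d = 12, 88, 10, 890
example_or = odds_ratio(a, b, c, d)
example_p = fisher_exact_greater(a, b, c, d)
print("odds ratio:", example_or)
print("one-sided p-value:", example_p)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the hypergeometric terms, at the cost of speed on very large tables.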
Evaluating single-subject study methods for personal transcriptomic interpretations to advance precision medicine
Background: Gene expression profiling has benefited medicine by providing clinically relevant insights at the molecular candidate and systems levels. However, to adopt a more 'precision' approach that integrates individual variability including omics data into risk assessments, diagnoses, and therapeutic decision making, whole transcriptome expression needs to be interpreted meaningfully for single subjects. We propose an all-against-one framework that uses biological replicates in isogenic conditions for testing differentially expressed genes (DEGs) in a single subject (ss) in the absence of an appropriate external reference standard or replicates. To evaluate our proposed all-against-one framework, we construct reference standards (RSs) with five conventional replicate-anchored analyses (NOISeq, DEGseq, edgeR, DESeq, DESeq2) and the remainder were treated separately as single-subject sample pairs for ss analyses (without replicates).
Results: Eight ss methods (NOISeq, DEGseq, edgeR, mixture model, DESeq, DESeq2, iDEG, and ensemble) for identifying genes with differential expression were compared in Yeast (parental line versus snf2 deletion mutant; n=42/condition) and an MCF7 breast-cancer cell line (baseline versus stimulated with estradiol; n=7/condition). Receiver-operator characteristic (ROC) and precision-recall plots were determined for eight ss methods against each of the five RSs in both datasets. Consistent with prior analyses of these data, ~50% and ~15% DEGs were obtained in Yeast and MCF7 datasets respectively, regardless of the RS method. NOISeq, edgeR, and DESeq were the most concordant for creating an RS. Single-subject versions of NOISeq, DEGseq, and an ensemble learner achieved the best median ROC-area-under-the-curve to compare two transcriptomes without replicates regardless of the RS method and dataset (>90% in Yeast, >0.75 in MCF7). Further, distinct specific single-subject methods perform better according to different proportions of DEGs.
Conclusions: The all-against-one framework provides an honest evaluation framework for single-subject DEG studies since these methods are evaluated, by design, against reference standards produced by unrelated DEG methods. The ss-ensemble method was the only one to reliably produce higher accuracies in all conditions tested in this conservative evaluation framework. However, single-subject methods for identifying DEGs from paired samples need improvement, as no method performed with precision >90% and obtained moderate levels of recall.
University of Arizona Health Sciences Center for Biomedical Informatics and Biostatistics; BIO5 Institute; NIH [U01AI122275, HL132532, NCI P30CA023074, 1UG3OD023171, 1S10RR029030]
Open access journal
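As a rough illustration of the evaluation described in this abstract, the sketch below scores one single-subject method against one reference standard using ROC area under the curve, precision and recall. The gene names, scores, threshold and reference set are hypothetical, and the functions are generic textbook definitions rather than the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen positive gene
    receives a higher score than a randomly chosen negative gene
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall(called, reference):
    """Precision and recall of a called gene set against a reference set."""
    called, reference = set(called), set(reference)
    true_pos = len(called & reference)
    precision = true_pos / len(called) if called else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical single-subject DEG scores (higher = more likely a DEG).
ss_scores = {"g1": 0.9, "g2": 0.8, "g3": 0.4, "g4": 0.3, "g5": 0.1}
# Hypothetical reference standard built from a replicate-based method.
reference_degs = {"g1", "g3"}

genes = sorted(ss_scores)
auc = roc_auc(
    [ss_scores[g] for g in genes],
    [g in reference_degs for g in genes],
)
called_degs = {g for g, s in ss_scores.items() if s >= 0.5}
precision, recall = precision_recall(called_degs, reference_degs)
print("AUC:", auc, "precision:", precision, "recall:", recall)
```

In an all-against-one design, this scoring would be repeated for every single-subject method against each reference standard, and the resulting AUCs summarized by their median.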
Reading Between the Genes: Computational Models to Discover Function from Noncoding DNA
Noncoding DNA, once called "junk", has revealed itself to be full of function. Technology development has allowed researchers to gather genome-scale data pointing towards complex regulatory regions, expression and function of noncoding RNA genes, and conserved elements. Variation in these regions has been tied to variation in biological function and human disease. This PSB session tackles the problem of handling, analyzing and interpreting the data relating to variation in and interactions between noncoding regions through computational biology. We feature an invited speaker on how variation in transcription factor coding sequences impacts sequence preference, along with submitted papers that span graph-based methods, integrative analyses, machine learning, and dimension reduction to explore questions of basic biology, cancer, diabetes, and clinical relevance.
University of Arizona Health Sciences CB2, the BIO5 Institute; NIH [U01AI122275, HL132532, CA023074, 1UG3OD023171, 1R01AG053589-01A1, 1S10RR029030]
Open access journal
Workshop during the Pacific Symposium of Biocomputing, Jan 3-7, 2019: Reading between the genes: interpreting non-coding DNA in high-throughput
Identifying functional elements and predicting mechanistic insight from non-coding DNA and non-coding variation remains a challenge. Advances in genome-scale, high-throughput technology, however, have brought these answers closer within reach than ever, though there is still a need for new computational approaches to analysis and integration. This workshop aims to explore these resources and new computational methods applied to regulatory elements, chromatin interactions, non-protein-coding genes, and other non-coding DNA.
Open access journal
C5 deficiency and C5a or C5aR blockade protects against cerebral malaria
Experimental infection of mice with Plasmodium berghei ANKA (PbA) provides a powerful model to define genetic determinants that regulate the development of cerebral malaria (CM). Based on the hypothesis that excessive activation of the complement system may confer susceptibility to CM, we investigated the role of C5/C5a in the development of CM. We show a spectrum of susceptibility to PbA in a panel of inbred mice; all CM-susceptible mice examined were found to be C5 sufficient, whereas all C5-deficient strains were resistant to CM. Transfer of the C5-defective allele from an A/J (CM-resistant) onto a C57BL/6 (CM-susceptible) genetic background in a congenic strain conferred increased resistance to CM; conversely, transfer of the C5-sufficient allele from the C57BL/6 onto the A/J background recapitulated the CM-susceptible phenotype. The role of C5 was further explored in B10.D2 mice, which are identical for all loci other than C5. C5-deficient B10.D2 mice were protected from CM, whereas C5-sufficient B10.D2 mice were susceptible. Antibody blockade of C5a or C5a receptor (C5aR) rescued susceptible mice from CM. In vitro studies showed that C5a potentiated cytokine secretion induced by the malaria product P. falciparum glycosylphosphatidylinositol, and that C5aR blockade abrogated these amplified responses. These data provide evidence implicating C5/C5a in the pathogenesis of CM.
Analysis of protein-coding genetic variation in 60,706 humans
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation, identifying 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.